A Coherent Interpretation of AUC as a Measure of Aggregated Classification Performance
Abstract
The area under the ROC curve (AUC), a well-known measure of ranking performance, is also often used as a measure of classification performance, aggregating over decision thresholds as well as class and cost skews. However, David Hand has recently argued that AUC is fundamentally incoherent as a measure of aggregated classifier performance and proposed an alternative measure (Hand, 2009). Specifically, Hand derives a linear relationship between AUC and expected minimum loss, where the expectation is taken over a distribution of the misclassification cost parameter that depends on the model under consideration. Replacing this distribution with a Beta(2,2) distribution, Hand derives his alternative measure H. In this paper we offer an alternative, coherent interpretation of AUC as linearly related to expected loss. We use a distribution over the cost parameter and a distribution over data points, both uniform and hence model-independent. Should one wish to consider only optimal thresholds, we demonstrate that a simple and more intuitive alternative to Hand's H measure is already available in the form of the area under the cost curve.
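The linear relationship described in the abstract can be checked numerically. The sketch below is illustrative and is not the authors' code. It assumes that higher scores indicate the positive class, that an instance is predicted positive when its score exceeds the threshold, and that the loss at cost parameter c is 2{c * pi_pos * FNR + (1 - c) * pi_neg * FPR}, so that averaging over uniform c gives the plain error rate. Under those assumptions, picking the threshold uniformly from the scores of the data points gives an expected loss of pi_pos * pi_neg * (1 - 2 * AUC) + 1/2, up to a finite-sample term of at most 1/(2n). All function names are hypothetical.

```python
import random


def auc(pos, neg):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 1/2."""
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def error_rate(pos, neg, t):
    """Error rate when predicting positive iff score > t.

    Assumption: this equals the loss 2{c*pi_pos*FNR + (1-c)*pi_neg*FPR}
    averaged over a uniform cost parameter c in [0, 1].
    """
    false_neg = sum(p <= t for p in pos)
    false_pos = sum(q > t for q in neg)
    return (false_neg + false_pos) / (len(pos) + len(neg))


def expected_loss_uniform(pos, neg):
    """Average the error rate over thresholds drawn uniformly from the data points."""
    thresholds = pos + neg
    return sum(error_rate(pos, neg, t) for t in thresholds) / len(thresholds)


def loss_from_auc(pos, neg):
    """Closed form pi_pos * pi_neg * (1 - 2*AUC) + 1/2, linear in AUC."""
    n = len(pos) + len(neg)
    pi_pos, pi_neg = len(pos) / n, len(neg) / n
    return pi_pos * pi_neg * (1 - 2 * auc(pos, neg)) + 0.5


# Synthetic scores (made-up data, for illustration only).
rng = random.Random(0)
pos = [rng.gauss(1.0, 1.0) for _ in range(150)]
neg = [rng.gauss(0.0, 1.0) for _ in range(250)]

print("AUC:", auc(pos, neg))
print("expected loss, direct average:", expected_loss_uniform(pos, neg))
print("expected loss, from AUC:", loss_from_auc(pos, neg))
```

The two printed loss values should agree closely. Neither the cost distribution nor the threshold distribution depends on the model's score distributions, which is the sense in which the abstract calls this interpretation model-independent.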
Similar resources
On the coherence of AUC
The area under the ROC curve (AUC) is a well-known measure of ranking performance, and is also often used as a measure of classification performance, aggregating over decision thresholds as well as class and cost skews. However, David Hand has recently argued that AUC is fundamentally incoherent as a measure of aggregated classifier performance and proposed an alternative measure [5]. Specifica...
Uncertainty Measurement for Ultrasonic Sensor Fusion Using Generalized Aggregated Uncertainty Measure 1
In this paper, target differentiation based on patterns in data obtained from a set of two ultrasonic sensors is considered. A neural network based target classifier is applied to these data to categorize the data of each sensor. Then the results are fused together by Dempster–Shafer theory (DST) and Dezert–Smarandache theory (DSmT) to make the final decision. The Generalized Aggregated Unce...
Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms
In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...
Increasing the accuracy of the classification of diabetic patients in terms of functional limitation using linear and nonlinear combinations of biomarkers: Ramp AUC method
The Area under the ROC Curve (AUC) is a common index for evaluating the ability of biomarkers to classify. In practice, a single biomarker has limited classification ability, so to improve classification performance, we are interested in combining biomarkers linearly and nonlinearly. In this study, while introducing various types of loss functions, the Ramp AUC method and some of...
Measuring classification performance: the hmeasure package
The ubiquity of binary classification problems has given rise to a prolific literature dedicated to the proposal of novel classification methods, as well as the incremental improvement of existing ones, that largely relies on effective performance metrics. The inherent trade-off between false positives and false negatives, however, complicates any attempt at providing a scalar summary of classi...